148 research outputs found

    From gene trees to species trees II: Species tree inference in the deep coalescence model

    When gene copies are sampled from various species, the resulting gene tree might disagree with the containing species tree. The primary causes of gene tree and species tree discord include lineage sorting, horizontal gene transfer, and gene duplication and loss. Each of these events yields a different parsimony criterion for inferring the (containing) species tree from gene trees. With lineage sorting, species tree inference seeks the tree that minimizes the extra gene lineages that had to coexist along species lineages; with gene duplication, it seeks the tree that minimizes gene duplications and/or losses. In this paper, we show the following results: (i) The deep coalescence cost is equal to the number of gene losses minus two times the gene duplication cost in the reconciliation of a uniquely leaf-labeled gene tree and a species tree. The deep coalescence cost can be computed in linear time for an arbitrary gene tree and species tree. (ii) The deep coalescence cost is always no less than the gene duplication cost in the reconciliation of an arbitrary gene tree and a species tree. (iii) Species tree inference by minimizing deep coalescences is NP-hard. Comment: 17 pages, 6 figures
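Result (i) can be checked on small inputs with a minimal sketch. It assumes binary trees encoded as nested tuples with string leaves, a gene tree and a species tree on the same leaf set with one gene per species, and one common convention for counting losses under the LCA mapping. The function names (`index_species_tree`, `lca`, `reconcile`) are hypothetical, and the naive LCA walk used here is not the linear-time method the abstract refers to.

```python
def index_species_tree(tree):
    """Assign integer ids to species-tree nodes; return parent, depth, leaf maps."""
    parent, depth, leaf_node = {}, {}, {}
    counter = [0]

    def visit(node, par, d):
        nid = counter[0]
        counter[0] += 1
        parent[nid], depth[nid] = par, d
        if isinstance(node, str):
            leaf_node[node] = nid
        else:
            for child in node:
                visit(child, nid, d + 1)

    visit(tree, None, 0)
    return parent, depth, leaf_node


def lca(parent, depth, a, b):
    """Naive lowest common ancestor: climb until the two nodes meet."""
    while depth[a] > depth[b]:
        a = parent[a]
    while depth[b] > depth[a]:
        b = parent[b]
    while a != b:
        a, b = parent[a], parent[b]
    return a


def reconcile(gene_tree, species_tree):
    """Return (deep_coalescence, duplications, losses) under the LCA mapping."""
    parent, depth, leaf_node = index_species_tree(species_tree)
    n_species_edges = len(parent) - 1
    total = {"path": 0, "dup": 0, "loss": 0}

    def visit(g):
        if isinstance(g, str):
            return leaf_node[g]
        images = [visit(child) for child in g]
        m = lca(parent, depth, images[0], images[1])
        gaps = [depth[i] - depth[m] for i in images]
        total["path"] += sum(gaps)
        if m in images:
            # Duplication: the node maps to the same species node as a child.
            total["dup"] += 1
            total["loss"] += sum(gaps)
        else:
            # Speciation: only the skipped intermediate species nodes are losses.
            total["loss"] += sum(gaps) - 2
        return m

    visit(gene_tree)
    # Every species edge carries at least one gene lineage (same leaf sets),
    # so the extra lineages are the total path length minus the edge count.
    deep_coalescence = total["path"] - n_species_edges
    return deep_coalescence, total["dup"], total["loss"]


species = (("A", "B"), ("C", "D"))
gene = (("A", "C"), ("B", "D"))
dc, dup, loss = reconcile(gene, species)
print(dc, dup, loss, dc == loss - 2 * dup)
```

For this discordant pair the sketch reports two extra lineages, one duplication and four losses, which agrees with the identity in (i).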

    On Tree Based Phylogenetic Networks

    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. Reticulation-visible networks and child-sibling networks are all tree-based. In this work, we present a simple necessary and sufficient condition for tree-based networks and prove that there is a universal tree-based network for each set of species such that every phylogenetic tree on the same species is a base of this network. The existence of a universal tree-based network implies that for any given set of phylogenetic trees (resp. clusters) on the same species there exists a tree-based network that displays all of them. Comment: 17 pages, 6 figures

    Bounding the Size of a Network Defined By Visibility Property

    Phylogenetic networks are mathematical structures for modeling and visualizing reticulation processes in the study of evolution. Galled networks, reticulation-visible networks, nearly-stable networks and stable-child networks are four classes of phylogenetic networks that were recently introduced to study the topological and algorithmic aspects of phylogenetic networks. We prove the following results. (1) A binary galled network with n leaves has at most 2(n-1) reticulation nodes. (2) A binary nearly-stable network with n leaves has at most 3(n-1) reticulation nodes. (3) A binary stable-child network with n leaves has at most 7(n-1) reticulation nodes. Comment: 23 pages, 9 figures

    Locating a Tree in a Reticulation-Visible Network in Cubic Time

    In this work, we answer an open problem in the study of phylogenetic networks. Phylogenetic trees are rooted binary trees in which all edges are directed away from the root, whereas phylogenetic networks are rooted acyclic digraphs. For the purpose of evolutionary model validation, biologists often want to know whether or not a phylogenetic tree is contained in a phylogenetic network. The tree containment problem is NP-complete even for very restricted classes of networks such as tree-sibling phylogenetic networks. We prove that this problem is solvable in cubic time for stable phylogenetic networks. A linear-time algorithm is also presented for the cluster containment problem. Comment: 25 pages, 3 figures

    Generating Normal Networks via Leaf Insertion and Nearest Neighbor Interchange

    Galled trees are studied as a recombination model in theoretical population genetics. This class of phylogenetic networks has been generalized to tree-child networks, normal networks and tree-based networks by relaxing a structural condition. Although these networks are simple, their topological structures have yet to be fully understood. It is well known that all phylogenetic trees on n taxa can be generated by the insertion of the n-th taxon into each edge of all the phylogenetic trees on n-1 taxa. We prove that all tree-child networks with k reticulate nodes on n taxa can be uniquely generated via three operations from all the tree-child networks with k-1 or k reticulate nodes on n-1 taxa. An application of this result is found in counting tree-child networks and normal networks. In particular, a simple formula is given for the number of rooted phylogenetic networks with one reticulate node. Comment: 4 figures and 13 pages
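The well-known leaf-insertion construction for trees mentioned above can be sketched as follows. This is only an illustration for rooted binary trees, not the three network operations of the paper. Trees are assumed to be nested tuples with string leaves, the edge above the root counts as an insertion position, and the names `insert_leaf` and `all_trees` are hypothetical.

```python
def insert_leaf(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to one edge of `tree`.

    The edge entering the root of `tree` is included as a position.
    """
    yield (tree, leaf)  # attach on the edge entering this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_leaf(left, leaf):
            yield (new_left, right)
        for new_right in insert_leaf(right, leaf):
            yield (left, new_right)


def all_trees(taxa):
    """Generate all rooted binary trees on `taxa` by repeated leaf insertion."""
    trees = [taxa[0]]
    for leaf in taxa[1:]:
        trees = [new for tree in trees for new in insert_leaf(tree, leaf)]
    return trees


for n in range(1, 6):
    print(n, len(all_trees(list("abcde")[:n])))
```

A rooted binary tree on n-1 taxa offers 2n-3 insertion positions, so the counts printed by the loop follow the double factorial (2n-3)!!.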

    Counting and Enumerating Galled Networks

    Galled trees are widely studied as a recombination model in population genetics. This class of phylogenetic networks is generalized into galled networks by relaxing a structural condition. In this work, a linear recurrence formula is given for counting 1-galled networks, which are galled networks satisfying the condition that each reticulate node has only one leaf descendant. Since every galled network consists of a set of 1-galled networks stacked one on top of the other, a method is also presented to count and enumerate galled networks. Comment: 7 figures, 2 tables

    Locating a Phylogenetic Tree in a Reticulation-Visible Network in Quadratic Time

    In phylogenetics, phylogenetic trees are rooted binary trees, whereas phylogenetic networks are rooted arbitrary acyclic digraphs. Edges are directed away from the root and leaves are uniquely labeled with taxa in phylogenetic networks. For the purpose of validating evolutionary models, biologists check whether or not a phylogenetic tree is contained in a phylogenetic network on the same taxa. This tree containment problem is known to be NP-complete. A phylogenetic network is reticulation-visible if every reticulation node separates the root of the network from some leaves. We answer an open problem by proving that the problem is solvable in quadratic time for reticulation-visible networks. The key tool used in our answer is a powerful decomposition theorem. It also allows us to design a linear-time algorithm for the cluster containment problem for networks of this type and to prove that every galled network with n leaves has at most 2(n-1) reticulation nodes. Comment: The journal version of arXiv:1507.02119v
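The definition of a reticulation-visible network can be made concrete with a brute-force check. This is a minimal sketch, not the decomposition-based algorithm of the paper. It assumes the network is given as a dictionary mapping each node to its list of children, that reticulation nodes are those with in-degree at least two, and that leaves are nodes without children. The function names are hypothetical.

```python
def reachable_leaves(children, root, removed=None):
    """Return the leaves reachable from `root` when `removed` is deleted."""
    seen, stack, leaves = set(), [root], set()
    while stack:
        node = stack.pop()
        if node == removed or node in seen:
            continue
        seen.add(node)
        kids = children.get(node, [])
        if not kids:
            leaves.add(node)
        stack.extend(kids)
    return leaves


def is_reticulation_visible(children, root):
    """Check that deleting any reticulation node cuts off at least one leaf."""
    indegree = {}
    for kids in children.values():
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1
    all_leaves = reachable_leaves(children, root)
    for node, deg in indegree.items():
        if deg >= 2 and reachable_leaves(children, root, removed=node) == all_leaves:
            return False  # this reticulation separates the root from no leaf
    return True


# Hypothetical example: the single reticulation h is the only way to reach z.
visible = {"r": ["a", "b"], "a": ["x", "h"], "b": ["h", "y"], "h": ["z"]}
# Hypothetical example: h1 can be bypassed through d, so h1 is not visible.
hidden = {"r": ["a", "b"], "a": ["x", "h1"], "b": ["h1", "d"],
          "d": ["h2", "y"], "h1": ["h2"], "h2": ["z"]}
print(is_reticulation_visible(visible, "r"), is_reticulation_visible(hidden, "r"))
```

Each reticulation triggers one graph traversal, so the check is simple but makes no attempt at efficiency.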

    Analyzing the Accuracy of the Fitch Method for Reconstructing Ancestral States on Ultrametric Phylogenies

    Recurrence formulas are presented for studying the accuracy of the Fitch method for reconstructing the ancestral states in a given phylogenetic tree. As applications, we analyze the convergence of the accuracy of reconstructing the root state in a complete binary tree of 2^n leaves as n goes to infinity and also give a lower bound on the accuracy of reconstructing the root state in an ultrametric tree. Comment: 14 pages
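For readers unfamiliar with the Fitch method, its bottom-up pass can be sketched as below. This shows only the standard parsimony step that produces the candidate state set at the root, not the recurrence formulas or the accuracy analysis of the paper. The nested-tuple tree encoding and the function name `fitch` are assumptions made for illustration.

```python
def fitch(tree, states):
    """Return (root_state_set, parsimony_score) for a rooted binary tree.

    `tree` is a nested tuple with string leaves; `states` maps leaf to state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: keep it, no extra change.
        return common, left_cost + right_cost
    # Children disagree: keep all candidates and count one state change.
    return left_set | right_set, left_cost + right_cost + 1


tree = (("a", "b"), ("c", "d"))
observed = {"a": 0, "b": 0, "c": 1, "d": 0}
print(fitch(tree, observed))
```

When the root set contains more than one state, the reconstruction is ambiguous, which is the situation an accuracy analysis has to account for.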

    Counting Tree-Child Networks and Their Subclasses

    Galled trees are studied as a recombination model in population genetics. This class of phylogenetic networks is generalized into tree-child, galled and reticulation-visible network classes by relaxing a structural condition imposed on galled trees. We count tree-child networks through enumerating their component graphs. Explicit counting formulas are also given for galled trees through their relationship to ordered trees, phylogenetic networks with few reticulations and phylogenetic networks in which the child of each reticulation is a leaf. Comment: 24 pages, 2 tables and 9 figures

    The compressions of reticulation-visible networks are tree-child

    Rooted phylogenetic networks are rooted acyclic digraphs. They are used to model complex evolution where hybridization, recombination and other reticulation events play important roles. A rigorous definition of network compression is introduced on the basis of recent studies of the relationships between clusters, trees and rooted phylogenetic networks. The concept reveals another interesting connection between the two well-studied network classes, tree-child networks and reticulation-visible networks, and enables us to define a new class of networks for which the cluster containment problem has a linear-time algorithm. Comment: 18 pages, 4 figures